SWE Bench AI News List | Blockchain.News

List of AI News about SWE Bench

2026-02-27 12:10

MiniMax M2.5 Beats Opus 4.6 on SWE-Bench Verified: 80.2% Score, 3x Faster, $1 per Hour (AI Coding Benchmark Analysis)

According to God of Prompt on X (Twitter), MiniMax M2.5 scored 80.2% on the SWE-Bench Verified benchmark, surpassing Opus 4.6, while running roughly 3x faster at a flat price of $1 per hour. The post adds that the model uses only 10B activated parameters, which would make it the smallest Tier-1 model for coding tasks. As reported by the same source, these figures imply lower latency and significantly lower inference cost, which would make 24/7 autonomous coding agents and continuous integration bots affordable on practical budgets. According to the post, pairing high benchmark accuracy with a small active parameter count suggests strong efficiency per dollar, which could improve ROI for software teams deploying code assistants, test repair bots, and maintenance agents in production pipelines.
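To make the budget claim concrete, the short sketch below works out what an always-on agent would cost at the flat $1 per hour rate quoted in the post. The rate is taken from the post and has not been independently verified. The function name `always_on_cost` and its parameters are hypothetical, introduced only for this illustration.

```python
# Flat hourly rate as reported in the post (unverified claim, used as an assumption).
HOURLY_RATE_USD = 1.00


def always_on_cost(days: int, agents: int = 1,
                   hourly_rate: float = HOURLY_RATE_USD) -> float:
    """Return the USD cost of running `agents` agents 24 hours a day for `days` days."""
    return days * 24 * agents * hourly_rate


if __name__ == "__main__":
    # Print the cost of a single always-on agent over a few common periods.
    for label, days in [("day", 1), ("week", 7), ("30-day month", 30)]:
        print(f"1 agent, one {label}: ${always_on_cost(days):,.2f}")
```

Under that assumption, one agent running around the clock costs $24 per day, or $720 over a 30-day month. The total scales linearly with the number of agents.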

Source